The code below clears the environment, to avoid carrying over unneeded functions or variables, and loads the libraries used in our script. We begin by installing and loading the required packages. For BDA we mainly use brms (Bürkner 2020), while bayesplot (Gabry and Mahr 2020) provides support with various plots and functions for calculating credible intervals.
rm( list = ls() ) # Cleans the environment.
# You can install the packages below in case they are not installed already.
# In case you don't have rstan installed, see:
# https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started
# install.packages(c("skimr", "ggdag", "dagitty", "loo", "tidyverse", "patchwork",
#                    "brms", "bayesplot", "ggthemes"))
library(skimr) # For getting data summaries
library(tidyverse) # For transforming and visualizing data.
library(ggthemes) # Themes for ggplot
ggplot2::theme_set(theme_tufte())
library(patchwork) # Combining many plots into the same figure.
library(ggdag) # For DAG analysis
library(dagitty) # For DAG analysis
library(brms) # BDA packages. Alternatively, one can use rethinking & rstanarm.
library(loo) # For comparing different models' performance
library(bayesplot) # Plotting BDA output by Gabry et al.
bayesplot::color_scheme_set("viridis") # Uses the viridis palette in bayesplot plots.
For reproducibility and efficiency, we set an arbitrary seed, the sampling parameters, and the number of cores used to speed up the MCMC sampling.
SAMPLES = 5000
WARMUP = 1000
CHAINS = 4
SEED = 2020
DELTA = 0.99
TREE = 13
set.seed(SEED)
options(mc.cores = parallel::detectCores())
First, we load the data and take a look at it. As this report analyzes the runtime differences between Bugspots and Linespots, we focus only on the variables relevant to runtime.
d = read_delim(
'../data/full_evaluation.csv',
delim = ",",
locale = locale(decimal_mark = "."),
col_names = TRUE,
col_types = cols(
AUCEC1 = col_double(),
AUCEC100 = col_double(),
AUCEC20 = col_double(),
AUCEC5 = col_double(),
AUROC = col_double(),
Algorithm = col_factor(),
Depth = col_double(),
EInspect10 = col_double(),
EInspect100 = col_double(),
EInspect200 = col_double(),
EInspect50 = col_double(),
EInspectF = col_double(),
EXAM = col_double(),
FixCount = col_double(),
Future = col_double(),
LOC = col_double(),
Origin = col_double(),
Project = col_factor(),
Runtime = col_double(),
Time = col_factor(),
Weight = col_factor(),
comit_version = col_factor(),
commits = col_double(),
language = col_factor(),
url = col_factor()
)
)
d = data.frame("Runtime" = d$Runtime, "Algorithm" = d$Algorithm, "LOC" = d$LOC,
"FixCount" = d$FixCount, "Project" = d$Project, "language" = d$language)
| Name | d |
|---|---|
| Number of rows | 486 |
| Number of columns | 6 |
| Column type frequency: | |
| factor | 3 |
| numeric | 3 |
| Group variables | None |
Variable type: factor
| skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts |
|---|---|---|---|---|---|
| Algorithm | 0 | 1 | FALSE | 2 | Lin: 243, Bug: 243 |
| Project | 0 | 1 | FALSE | 32 | mys: 18, woo: 18, pre: 18, ser: 18 |
| language | 0 | 1 | FALSE | 8 | Jav: 126, Jav: 72, Typ: 66, C: 54 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| Runtime | 0 | 1 | 6.60 | 16.88 | 0 | 0.01 | 0.05 | 0.20 | 125.66 | ▇▁▁▁▁ |
| LOC | 0 | 1 | 1658395.80 | 3014179.45 | 11741 | 245703.00 | 619122.00 | 1554904.00 | 20705587.00 | ▇▁▁▁▁ |
| FixCount | 0 | 1 | 90.66 | 72.62 | 0 | 39.25 | 68.00 | 113.75 | 418.00 | ▇▃▁▁▁ |
There seem to be some cases where no faults were found in the pseudo future. As those cases can’t tell us anything, we will remove them.
d = subset(d, d$FixCount != 0)
With LOC and FixCount spanning multiple orders of magnitude, we standardize both to improve sampling performance.
d$LOC = scale(d$LOC)
d$FixCount = scale(d$FixCount)
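As a quick illustration of what `scale()` does (a sketch with made-up numbers, not the evaluation data), standardization subtracts the mean and divides by the standard deviation:

```r
x = c(11741, 245703, 619122, 1554904, 20705587) # hypothetical LOC-like values
z = as.numeric(scale(x))                        # center to mean 0, scale to sd 1
z_manual = (x - mean(x)) / sd(x)                # the same transformation by hand
stopifnot(isTRUE(all.equal(z, z_manual)))
stopifnot(abs(mean(z)) < 1e-12, abs(sd(z) - 1) < 1e-12)
```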
Based on the data we gathered, we built a DAG representing the causal relationships as we assume them. In this case, the analysis is rather simple: we designed the experiment such that there is no incoming causal relationship to Algorithm, so we can use all predictors without confounding problems.
The graph shows that there is only a single possible causal path from ‘Algorithm’ to ‘Runtime’, so regardless of which other predictors we add to a model, they will not introduce bias or confounding.
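The DAG itself is not reproduced here; a minimal dagitty sketch of such a structure could look as follows. The edge set below is an illustration of the assumptions stated above, not necessarily the exact graph used in the report:

```r
library(dagitty)

# Sketch of the assumed causal structure (illustrative edge set).
# Algorithm has no incoming edges by experimental design, so there are
# no backdoor paths into the Algorithm -> Runtime relationship.
Runtime_dag = dagitty("dag {
  Algorithm -> Runtime
  LOC -> Runtime
  FixCount -> Runtime
  Project -> LOC
  Project -> FixCount
  Language -> LOC
}")
exposures(Runtime_dag) = "Algorithm"
outcomes(Runtime_dag) = "Runtime"
```

With exposure and outcome attached to the graph, `isAdjustmentSet()` only needs the candidate predictor set as its second argument.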
Finally, we can also explicitly test whether the four predictor sets we plan to use are valid adjustment sets.
isAdjustmentSet(Runtime_dag, c("LOC"))
isAdjustmentSet(Runtime_dag, c("LOC", "FixCount"))
isAdjustmentSet(Runtime_dag, c("LOC", "FixCount", "Project"))
isAdjustmentSet(Runtime_dag, c("LOC", "FixCount", "Project", "Language"))
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
We are interested in Runtime as our outcome and Algorithm as our exposure, controlling for LOC, FixCount, Project, and language.
We build generalized linear models (GLMs) with different predictor combinations and compare different likelihoods.
\[ \begin{split} \mathcal{M_i}: \mathrm{Runtime_i} \thicksim &\ \mathrm{Lognormal}(\mu_i, \sigma) \\ \mu_i = &\ \alpha + \beta_A \mathrm{Algorithm} + \beta_L \mathrm{LOC} + \beta_F \mathrm{FixCount}\\ \alpha \thicksim &\ \mathrm{Normal}(0, 5) \\ \beta_A, \beta_L, \beta_F \thicksim &\ \mathrm{Normal}(0, 1) \\ \sigma \thicksim &\ \mathrm{Gamma}(0.1, 0.01) \\ \end{split} \]
ln2p = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount),
data = d,
family = lognormal(),
prior = c(
prior(normal(0, 5), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sigma)
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = "only",
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
pp_check(ln2p, nsamples = 100) + scale_x_continuous(trans='log10')
ln2 = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount),
data = d,
family = lognormal(),
prior = c(
prior(normal(0, 5), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sigma)
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(ln2)
## Family: lognormal
## Links: mu = identity; sigma = identity
## Formula: Runtime ~ 1 + Algorithm + LOC + FixCount
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.08 0.12 -0.32 0.15 1.00 14814 10596
## AlgorithmBugspots -4.42 0.17 -4.74 -4.09 1.00 15913 11102
## LOC 0.30 0.08 0.14 0.47 1.00 16722 11611
## FixCount -0.43 0.08 -0.60 -0.27 1.00 16529 11444
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.85 0.06 1.73 1.97 1.00 16425 11423
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Results:
pp_check(ln2, nsamples = 100) + scale_x_continuous(trans='log10')
mcmc_areas(ln2, pars = c("b_AlgorithmBugspots"), prob = 0.95)
eff = conditional_effects(ln2, effects = c("Algorithm"))
eff$Algorithm
eff
Diagnostics: all diagnostics look good.
min(neff_ratio(ln2))
## [1] 0.4417685
max(rhat(ln2))
## [1] 1.000106
plot(ln2)
ln1p = brm(
bf(Runtime ~ 1 + Algorithm + LOC),
data = d,
family = lognormal(),
prior = c(
prior(normal(0, 5), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sigma)
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = "only",
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
pp_check(ln1p, nsamples = 100) + scale_x_continuous(trans='log10')
ln1 = brm(
bf(Runtime ~ 1 + Algorithm + LOC),
data = d,
family = lognormal(),
prior = c(
prior(normal(0, 5), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sigma)
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(ln1)
## Family: lognormal
## Links: mu = identity; sigma = identity
## Formula: Runtime ~ 1 + Algorithm + LOC
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.08 0.12 -0.32 0.15 1.00 14568 9818
## AlgorithmBugspots -4.41 0.17 -4.75 -4.08 1.00 14778 10190
## LOC 0.29 0.09 0.12 0.47 1.00 14974 9897
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.90 0.06 1.78 2.02 1.00 14788 9936
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
pp_check(ln1, nsamples = 100) + scale_x_continuous(trans='log10')
ln3p = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project)),
data = d,
family = lognormal(),
prior = c(
prior(normal(0, 5), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sd),
prior(gamma(0.1, 0.01), class=sigma)
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = "only",
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
pp_check(ln3p, nsamples = 100) + scale_x_continuous(trans='log10')
ln3 = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project)),
data = d,
family = lognormal(),
prior = c(
prior(normal(0, 5), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sd),
prior(gamma(0.1, 0.01), class=sigma)
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(ln3)
## Family: lognormal
## Links: mu = identity; sigma = identity
## Formula: Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project)
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Group-Level Effects:
## ~Project (Number of levels: 32)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.03 0.06 0.00 0.23 1.00 3808 6006
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.08 0.12 -0.32 0.15 1.00 19619 12074
## AlgorithmBugspots -4.42 0.17 -4.74 -4.09 1.00 21824 11771
## LOC 0.30 0.08 0.14 0.47 1.00 19486 11344
## FixCount -0.43 0.09 -0.60 -0.27 1.00 17513 10479
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.85 0.06 1.74 1.97 1.00 21119 12771
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
pp_check(ln3, nsamples = 100) + scale_x_continuous(trans='log10')
ln4p = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project)),
data = d,
family = lognormal(),
prior = c(
prior(normal(0, 5), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sd),
prior(gamma(0.1, 0.01), class=sigma)
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = "only",
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
pp_check(ln4p, nsamples = 100) + scale_x_continuous(trans='log10')
ln4 = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project)),
data = d,
family = lognormal(),
prior = c(
prior(normal(0, 5), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sd),
prior(gamma(0.1, 0.01), class=sigma)
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(ln4)
## Family: lognormal
## Links: mu = identity; sigma = identity
## Formula: Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project)
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Group-Level Effects:
## ~Project (Number of levels: 32)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.03 0.06 0.00 0.23 1.00 3808 6006
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.08 0.12 -0.32 0.15 1.00 19619 12074
## AlgorithmBugspots -4.42 0.17 -4.74 -4.09 1.00 21824 11771
## LOC 0.30 0.08 0.14 0.47 1.00 19486 11344
## FixCount -0.43 0.09 -0.60 -0.27 1.00 17513 10479
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.85 0.06 1.74 1.97 1.00 21119 12771
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
pp_check(ln4, nsamples = 100) + scale_x_continuous(trans='log10')
loo_ln1 = loo(ln1)
loo_ln2 = loo(ln2)
loo_ln3 = loo(ln3)
loo_ln4 = loo(ln4)
loo_compare(loo_ln1, loo_ln2, loo_ln3, loo_ln4)
## elpd_diff se_diff
## ln2 0.0 0.0
## ln3 0.0 0.1
## ln4 0.0 0.1
## ln1 -12.1 4.0
We use a shifted lognormal distribution, as runtime is a kind of response time. For the shift parameter \(ndt\), we use a uniform prior reaching up to roughly double the highest measured Runtime. \[ \begin{split} \mathcal{M}_1: \mathrm{Runtime_i} \thicksim &\ \mathrm{Shifted\text{-}Lognormal}(\mu_i, \sigma, ndt) \\ \mu_i = &\ \alpha + \beta_A \mathrm{Algorithm} + \beta_L \mathrm{LOC} + \beta_F \mathrm{FixCount}\\ \alpha \thicksim &\ \mathrm{Normal}(0, 2) \\ \beta_A, \beta_L, \beta_F \thicksim &\ \mathrm{Normal}(0, 1) \\ \sigma \thicksim &\ \mathrm{Gamma}(0.1, 0.01) \\ ndt \thicksim &\ \mathrm{Uniform}(0, 210) \\ \end{split} \]
sln2p = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount),
data = d,
family = shifted_lognormal(),
prior = c(
prior(normal(0, 2), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sigma),
prior(uniform(0, 210), class=ndt)
),
init = 0, chains = CHAINS, seed = SEED,
cores = parallel::detectCores(),
sample_prior = "only"
)
pp_check(sln2p, nsamples = 100) + scale_x_continuous(trans='log10')
sln2 = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount),
data = d,
family = shifted_lognormal(),
prior = c(
prior(normal(0, 2), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sigma),
prior(uniform(0, 210), class=ndt)
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(sln2)
## Family: shifted_lognormal
## Links: mu = identity; sigma = identity; ndt = identity
## Formula: Runtime ~ 1 + Algorithm + LOC + FixCount
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.11 0.13 -0.36 0.14 1.00 17858 11891
## AlgorithmBugspots -4.94 0.18 -5.28 -4.60 1.00 17416 12038
## LOC 0.36 0.09 0.18 0.54 1.00 17424 11734
## FixCount -0.41 0.09 -0.58 -0.24 1.00 16514 11491
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.93 0.06 1.81 2.06 1.00 17460 10710
## ndt 0.00 0.00 0.00 0.00 1.00 15418 9556
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Results:
pp_check(sln2, nsamples = 100) + scale_x_continuous(trans='log10')
mcmc_areas(sln2, pars = c("b_AlgorithmBugspots"), prob = 0.95)
eff = conditional_effects(sln2, effects = c("Algorithm"))
eff$Algorithm
eff
Diagnostics: all diagnostics look good.
min(neff_ratio(sln2))
## [1] 0.4468795
max(rhat(sln2))
## [1] 1.000507
plot(sln2)
sln1p = brm(
bf(Runtime ~ 1 + Algorithm + LOC),
data = d,
family = shifted_lognormal(),
prior = c(
prior(normal(0, 2), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sigma),
prior(uniform(0, 210), class=ndt)
),
init = 0, chains = CHAINS, seed = SEED,
cores = parallel::detectCores(),
sample_prior = "only"
)
pp_check(sln1p, nsamples = 100) + scale_x_continuous(trans='log10')
sln1 = brm(
bf(Runtime ~ 1 + Algorithm + LOC),
data = d,
family = shifted_lognormal(),
prior = c(
prior(normal(0, 2), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sigma),
prior(uniform(0, 210), class=ndt)
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(sln1)
## Family: shifted_lognormal
## Links: mu = identity; sigma = identity; ndt = identity
## Formula: Runtime ~ 1 + Algorithm + LOC
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.11 0.13 -0.36 0.14 1.00 16300 11098
## AlgorithmBugspots -4.93 0.18 -5.28 -4.58 1.00 16523 10798
## LOC 0.35 0.09 0.17 0.53 1.00 16509 10836
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.97 0.06 1.85 2.11 1.00 17179 11359
## ndt 0.00 0.00 0.00 0.00 1.00 13319 8904
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
pp_check(sln1, nsamples = 100) + scale_x_continuous(trans='log10')
sln3p = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project)),
data = d,
family = shifted_lognormal(),
prior = c(
prior(normal(0, 2), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sd),
prior(gamma(0.1, 0.01), class=sigma),
prior(uniform(0, 210), class=ndt)
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = "only",
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
pp_check(sln3p, nsamples = 100) + scale_x_continuous(trans='log10')
sln3 = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project)),
data = d,
family = shifted_lognormal(),
prior = c(
prior(normal(0, 2), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sd),
prior(gamma(0.1, 0.01), class=sigma),
prior(uniform(0, 210), class=ndt)
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(sln3)
## Family: shifted_lognormal
## Links: mu = identity; sigma = identity; ndt = identity
## Formula: Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project)
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Group-Level Effects:
## ~Project (Number of levels: 32)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.04 0.09 0.00 0.32 1.00 2905 4658
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.11 0.13 -0.36 0.13 1.00 29204 11861
## AlgorithmBugspots -4.94 0.17 -5.27 -4.61 1.00 32669 11427
## LOC 0.36 0.09 0.19 0.54 1.00 22952 9440
## FixCount -0.41 0.09 -0.59 -0.24 1.00 23062 10518
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.93 0.06 1.81 2.06 1.00 30219 12034
## ndt 0.00 0.00 0.00 0.00 1.00 28122 10744
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
pp_check(sln3, nsamples = 100) + scale_x_continuous(trans='log10')
sln4p = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project) + (1 | language)),
data = d,
family = shifted_lognormal(),
prior = c(
prior(normal(0, 2), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sd),
prior(gamma(0.1, 0.01), class=sigma),
prior(uniform(0, 210), class=ndt)
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = "only",
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
pp_check(sln4p, nsamples = 100) + scale_x_continuous(trans='log10')
sln4 = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project) + (1 | language)),
data = d,
family = shifted_lognormal(),
prior = c(
prior(normal(0, 2), class=Intercept),
prior(normal(0, 1), class=b),
prior(gamma(0.1, 0.01), class=sd),
prior(gamma(0.1, 0.01), class=sigma),
prior(uniform(0, 210), class=ndt)
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(sln4)
## Family: shifted_lognormal
## Links: mu = identity; sigma = identity; ndt = identity
## Formula: Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project) + (1 | language)
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Group-Level Effects:
## ~language (Number of levels: 8)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.02 0.06 0.00 0.22 1.00 3932 6090
##
## ~Project (Number of levels: 32)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.04 0.08 0.00 0.30 1.00 3073 5399
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.11 0.13 -0.36 0.14 1.00 18973 8360
## AlgorithmBugspots -4.94 0.17 -5.28 -4.59 1.00 26219 11408
## LOC 0.36 0.09 0.19 0.54 1.00 21215 10843
## FixCount -0.41 0.09 -0.59 -0.24 1.00 22685 10786
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.93 0.06 1.81 2.06 1.00 22711 10621
## ndt 0.00 0.00 0.00 0.00 1.00 25206 10472
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
pp_check(sln4, nsamples = 100) + scale_x_continuous(trans='log10')
loo_sln1 = loo(sln1)
loo_sln2 = loo(sln2)
loo_sln3 = loo(sln3, reloo=TRUE)
loo_sln4 = loo(sln4)
loo_compare(loo_sln1, loo_sln2, loo_sln3, loo_sln4)
## elpd_diff se_diff
## sln3 0.0 0.0
## sln2 0.0 0.1
## sln4 -0.1 0.1
## sln1 -9.8 3.7
We now model Runtime as a mixture of two lognormal distributions, allowing for two latent groups of runtimes with separate means and variances.
\[ \begin{split} \mathcal{M}_2: \mathrm{Runtime_i} \thicksim &\ \mathrm{Mixture}(\mathrm{Lognormal}(\mu_{1i}, \sigma_1), \mathrm{Lognormal}(\mu_{2i}, \sigma_2)) \\ \mu_{1i} = &\ \alpha_1 + \beta_{A_1} \mathrm{Algorithm} + \beta_{L_1} \mathrm{LOC} + \beta_{F_1} \mathrm{FixCount}\\ \mu_{2i} = &\ \alpha_2 + \beta_{A_2} \mathrm{Algorithm} + \beta_{L_2} \mathrm{LOC} + \beta_{F_2} \mathrm{FixCount}\\ \alpha_1 \thicksim &\ \mathrm{Normal}(-2, 1) \\ \alpha_2 \thicksim &\ \mathrm{Normal}(2, 1) \\ \beta_{A_1}, \beta_{A_2}, \beta_{L_1}, \beta_{L_2}, \beta_{F_1}, \beta_{F_2} \thicksim &\ \mathrm{Normal}(0, 4) \\ \sigma_1, \sigma_2 \thicksim &\ \mathrm{Weibull}(2, 1) \end{split} \]
mln3p = brm(
Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project),
data = d,
family = mixture(lognormal(), lognormal()),
prior = c(
prior(normal(-2, 1), class = "Intercept", dpar = "mu1"),
prior(normal(2, 1), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar = "mu1"),
prior(normal(0, 4), class = "b", dpar = "mu2"),
prior(weibull(2, 1), class = "sd", dpar = "mu1"),
prior(weibull(2, 1), class = "sd", dpar = "mu2"),
prior(weibull(2, 1), class = "sigma1"),
prior(weibull(2, 1), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = "only",
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
pp_check(mln3p, nsamples = 100) + scale_x_continuous(trans='log10')
mln3 = brm(
Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project),
data = d,
family = mixture(lognormal(), lognormal()),
prior = c(
prior(normal(-2, 1), class = "Intercept", dpar = "mu1"),
prior(normal(2, 1), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar = "mu1"),
prior(normal(0, 4), class = "b", dpar = "mu2"),
prior(weibull(2, 1), class = "sd", dpar = "mu1"),
prior(weibull(2, 1), class = "sd", dpar = "mu2"),
prior(weibull(2, 1), class = "sigma1"),
prior(weibull(2, 1), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(mln3)
## Family: mixture(lognormal, lognormal)
## Links: mu1 = identity; sigma1 = identity; mu2 = identity; sigma2 = identity; theta1 = identity; theta2 = identity
## Formula: Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project)
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Group-Level Effects:
## ~Project (Number of levels: 32)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(mu1_Intercept) 0.53 0.08 0.40 0.70 1.00 4031 7387
## sd(mu2_Intercept) 0.48 0.10 0.30 0.69 1.00 4217 7418
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## mu1_Intercept -1.94 0.10 -2.13 -1.75 1.00 1474
## mu2_Intercept 3.34 0.11 3.13 3.56 1.00 7168
## mu1_AlgorithmBugspots -2.54 0.04 -2.62 -2.46 1.00 20120
## mu1_LOC 0.26 0.05 0.16 0.35 1.00 10773
## mu1_FixCount 0.07 0.03 0.02 0.12 1.00 14712
## mu2_AlgorithmBugspots -7.84 0.12 -8.07 -7.60 1.00 15075
## mu2_LOC 0.13 0.07 -0.01 0.27 1.00 9369
## mu2_FixCount -0.09 0.08 -0.24 0.07 1.00 15834
## Tail_ESS
## mu1_Intercept 2818
## mu2_Intercept 9894
## mu1_AlgorithmBugspots 11841
## mu1_LOC 10649
## mu1_FixCount 12291
## mu2_AlgorithmBugspots 12628
## mu2_LOC 11306
## mu2_FixCount 13366
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma1 0.32 0.01 0.29 0.35 1.00 20047 10228
## sigma2 0.48 0.04 0.41 0.56 1.00 7555 10751
## theta1 0.67 0.03 0.62 0.72 1.00 21588 11461
## theta2 0.33 0.03 0.28 0.38 1.00 21588 11461
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Results:
pp_check(mln3, nsamples = 100) + scale_x_continuous(trans='log10')
mcmc_areas(mln3, pars = c("b_mu1_AlgorithmBugspots", "b_mu2_AlgorithmBugspots"), prob = 0.95)
eff = conditional_effects(mln3, effects = c("Algorithm"))
eff$Algorithm
eff
Diagnostics: all diagnostics look good.
min(neff_ratio(mln3))
## [1] 0.09190327
max(rhat(mln3))
## [1] 1.003985
plot(mln3)
mln1p = brm(
Runtime ~ 1 + Algorithm + LOC,
data = d,
family = mixture(lognormal(), lognormal()),
prior = c(
prior(normal(-2, 1), class = "Intercept", dpar = "mu1"),
prior(normal(2, 1), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar = "mu1"),
prior(normal(0, 4), class = "b", dpar = "mu2"),
prior(weibull(2, 1), class = "sigma1"),
prior(weibull(2, 1), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = "only",
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
pp_check(mln1p, nsamples = 100) + scale_x_continuous(trans='log10')
mln1 = brm(
Runtime ~ 1 + Algorithm + LOC,
data = d,
family = mixture(lognormal(), lognormal()),
prior = c(
prior(normal(-2, 1), class = "Intercept", dpar = "mu1"),
prior(normal(2, 1), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar = "mu1"),
prior(normal(0, 4), class = "b", dpar = "mu2"),
prior(weibull(2, 1), class = "sigma1"),
prior(weibull(2, 1), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(mln1)
## Family: mixture(lognormal, lognormal)
## Links: mu1 = identity; sigma1 = identity; mu2 = identity; sigma2 = identity; theta1 = identity; theta2 = identity
## Formula: Runtime ~ 1 + Algorithm + LOC
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## mu1_Intercept -1.96 0.04 -2.04 -1.88 1.00 19209
## mu2_Intercept 3.36 0.07 3.23 3.49 1.00 26247
## mu1_AlgorithmBugspots -2.62 0.06 -2.74 -2.49 1.00 15770
## mu1_LOC 0.28 0.03 0.22 0.34 1.00 18261
## mu2_AlgorithmBugspots -7.88 0.12 -8.11 -7.65 1.00 14389
## mu2_LOC 0.08 0.05 -0.02 0.17 1.00 16827
## Tail_ESS
## mu1_Intercept 11735
## mu2_Intercept 12847
## mu1_AlgorithmBugspots 11736
## mu1_LOC 11921
## mu2_AlgorithmBugspots 11363
## mu2_LOC 12135
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma1 0.50 0.02 0.46 0.54 1.00 17079 12530
## sigma2 0.62 0.04 0.55 0.70 1.00 16315 11634
## theta1 0.66 0.03 0.59 0.71 1.00 17720 11912
## theta2 0.34 0.03 0.29 0.41 1.00 17720 11912
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
mln2p = brm(
Runtime ~ 1 + Algorithm + LOC + FixCount,
data = d,
family = mixture(lognormal(), lognormal()),
prior = c(
prior(normal(-2, 1), class = "Intercept", dpar = "mu1"),
prior(normal(2, 1), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar = "mu1"),
prior(normal(0, 4), class = "b", dpar = "mu2"),
prior(weibull(2, 1), class = "sigma1"),
prior(weibull(2, 1), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = "only",
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
pp_check(mln2p, nsamples = 100) + scale_x_continuous(trans='log10')
mln2 = brm(
Runtime ~ 1 + Algorithm + LOC + FixCount,
data = d,
family = mixture(lognormal(), lognormal()),
prior = c(
prior(normal(-2, 1), class = "Intercept", dpar = "mu1"),
prior(normal(2, 1), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar = "mu1"),
prior(normal(0, 4), class = "b", dpar = "mu2"),
prior(weibull(2, 1), class = "sigma1"),
prior(weibull(2, 1), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(mln2)
## Family: mixture(lognormal, lognormal)
## Links: mu1 = identity; sigma1 = identity; mu2 = identity; sigma2 = identity; theta1 = identity; theta2 = identity
## Formula: Runtime ~ 1 + Algorithm + LOC + FixCount
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## mu1_Intercept -1.97 0.04 -2.05 -1.89 1.00 20177
## mu2_Intercept 3.32 0.07 3.17 3.46 1.00 21043
## mu1_AlgorithmBugspots -2.61 0.06 -2.74 -2.49 1.00 16779
## mu1_LOC 0.28 0.03 0.21 0.34 1.00 19178
## mu1_FixCount 0.03 0.03 -0.02 0.09 1.00 20844
## mu2_AlgorithmBugspots -7.82 0.12 -8.07 -7.58 1.00 14630
## mu2_LOC 0.08 0.05 -0.02 0.18 1.00 19783
## mu2_FixCount -0.09 0.07 -0.23 0.04 1.00 17768
## Tail_ESS
## mu1_Intercept 12492
## mu2_Intercept 13240
## mu1_AlgorithmBugspots 11501
## mu1_LOC 12206
## mu1_FixCount 11289
## mu2_AlgorithmBugspots 12117
## mu2_LOC 11941
## mu2_FixCount 11757
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma1 0.50 0.02 0.46 0.54 1.00 20017 11785
## sigma2 0.62 0.04 0.55 0.70 1.00 19450 11661
## theta1 0.66 0.03 0.60 0.72 1.00 16141 11895
## theta2 0.34 0.03 0.28 0.40 1.00 16141 11895
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
mln4p = brm(
Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project) + (1 | language),
data = d,
family = mixture(lognormal(), lognormal()),
prior = c(
prior(normal(-2, 1), class = "Intercept", dpar = "mu1"),
prior(normal(2, 1), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar = "mu1"),
prior(normal(0, 4), class = "b", dpar = "mu2"),
prior(weibull(2, 1), class = "sd", dpar = "mu1"),
prior(weibull(2, 1), class = "sd", dpar = "mu2"),
prior(weibull(2, 1), class = "sigma1"),
prior(weibull(2, 1), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = "only",
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
pp_check(mln4p, nsamples = 100) + scale_x_continuous(trans='log10')
mln4 = brm(
Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project) + (1 | language),
data = d,
family = mixture(lognormal(), lognormal()),
prior = c(
prior(normal(-2, 1), class = "Intercept", dpar = "mu1"),
prior(normal(2, 1), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar = "mu1"),
prior(normal(0, 4), class = "b", dpar = "mu2"),
prior(weibull(2, 1), class = "sd", dpar = "mu1"),
prior(weibull(2, 1), class = "sd", dpar = "mu2"),
prior(weibull(2, 1), class = "sigma1"),
prior(weibull(2, 1), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(mln4)
## Family: mixture(lognormal, lognormal)
## Links: mu1 = identity; sigma1 = identity; mu2 = identity; sigma2 = identity; theta1 = identity; theta2 = identity
## Formula: Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project) + (1 | language)
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Group-Level Effects:
## ~language (Number of levels: 8)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(mu1_Intercept) 0.32 0.18 0.06 0.75 1.00 2819 6147
## sd(mu2_Intercept) 0.45 0.21 0.13 0.95 1.00 4689 5630
##
## ~Project (Number of levels: 32)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(mu1_Intercept) 0.51 0.08 0.38 0.68 1.00 5960 9039
## sd(mu2_Intercept) 0.39 0.10 0.22 0.60 1.00 4820 6042
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## mu1_Intercept -1.93 0.16 -2.23 -1.60 1.00 7015
## mu2_Intercept 3.39 0.21 3.03 3.86 1.00 9012
## mu1_AlgorithmBugspots -2.53 0.04 -2.61 -2.46 1.00 23374
## mu1_LOC 0.25 0.05 0.16 0.35 1.00 17048
## mu1_FixCount 0.08 0.02 0.03 0.12 1.00 21199
## mu2_AlgorithmBugspots -7.86 0.12 -8.09 -7.62 1.00 18986
## mu2_LOC 0.17 0.07 0.02 0.31 1.00 14030
## mu2_FixCount -0.09 0.08 -0.24 0.06 1.00 19778
## Tail_ESS
## mu1_Intercept 7304
## mu2_Intercept 8884
## mu1_AlgorithmBugspots 11795
## mu1_LOC 11628
## mu1_FixCount 13119
## mu2_AlgorithmBugspots 13038
## mu2_LOC 12235
## mu2_FixCount 13009
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma1 0.31 0.01 0.29 0.34 1.00 19749 11075
## sigma2 0.47 0.04 0.40 0.55 1.00 7611 9448
## theta1 0.68 0.03 0.62 0.73 1.00 21315 12633
## theta2 0.32 0.03 0.27 0.38 1.00 21315 12633
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
loo_mln1 = loo(mln1)
loo_mln2 = loo(mln2)
loo_mln3 = loo(mln3)
loo_mln4 = loo(mln4, reloo=TRUE)
loo_compare(loo_mln1, loo_mln2, loo_mln3, loo_mln4)
## elpd_diff se_diff
## mln4 0.0 0.0
## mln2 -116.0 21.8
## mln1 -170.3 21.5
## mln3 -281.7 9.6
Based on the shape of the data, we suspect that some of the samples come from projects or commits with very large commit sizes that take extremely long to parse. To model this behavior, we also build mixture models that combine one component intended to capture the typical cases with a second component that follows the more extreme ones.
\[ \begin{split} \mathcal{M}_1: \mathrm{Runtime}_i \thicksim &\ \mathrm{Mixture}(\mathrm{Shifted\text{-}Lognormal}(\mu_{1i}, \sigma_1, ndt_1), \mathrm{Shifted\text{-}Lognormal}(\mu_{2i}, \sigma_2, ndt_2)) \\ \mu_{1i} = &\ \alpha_1 + \beta_{A_1} \mathrm{Algorithm} + \beta_{L_1} \mathrm{LOC}\\ \mu_{2i} = &\ \alpha_2 + \beta_{A_2} \mathrm{Algorithm} + \beta_{L_2} \mathrm{LOC}\\ \alpha_1, \alpha_2 \thicksim &\ \mathrm{Normal}(0, 5) \\ \beta_{A_1}, \beta_{A_2}, \beta_{L_1}, \beta_{L_2} \thicksim &\ \mathrm{Normal}(0, 2) \\ \sigma_1, \sigma_2 \thicksim &\ \mathrm{Gamma}(0.1, 0.01) \\ ndt_1 \thicksim &\ \mathrm{Uniform}(0, 0.005) \\ ndt_2 \thicksim &\ \mathrm{Uniform}(0, 210) \end{split} \]
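To build intuition for why a two-component lognormal mixture fits this kind of data, the following base-R sketch simulates runtimes from such a mixture. The component parameters and mixing weight are illustrative values only (loosely inspired by the fitted intercepts and theta1 reported below), not estimates from our models.

```r
# Illustrative simulation of a two-component lognormal mixture:
# a "typical" component with short runtimes and an "extreme" component
# for the slow, large-commit cases. All parameter values are made up.
set.seed(2020)
n <- 1e4
theta1 <- 0.66                      # mixing weight of the typical component
z <- rbinom(n, 1, theta1)           # latent component indicator per observation
runtime <- ifelse(z == 1,
                  rlnorm(n, meanlog = -2, sdlog = 0.5),  # typical runs
                  rlnorm(n, meanlog =  3, sdlog = 0.6))  # extreme runs
# On the log scale the two components appear as two separate modes,
# which is the pattern the mixture models below are meant to capture:
# hist(log10(runtime), breaks = 50)
```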
msln3p = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project)),
d,
family = mixture(shifted_lognormal(), shifted_lognormal()),
prior = c(
prior(normal(-2, 2), class = "Intercept", dpar = "mu1"),
prior(normal(2, 2), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar ="mu1"),
prior(normal(0, 4), class = "b", dpar ="mu2"),
prior(uniform(0, 0.005), class = "ndt1"),
prior(uniform(0, 210), class = "ndt2"),
prior(gamma(0.1, 0.01), class = "sd", dpar = "mu1"),
prior(gamma(0.1, 0.01), class = "sd", dpar = "mu2"),
prior(gamma(0.1, 0.01), class = "sigma1"),
prior(gamma(0.1, 0.01), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = "only",
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
pp_check(msln3p, nsamples = 100) + scale_x_continuous(trans='log10')
msln3 = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project)),
d,
family = mixture(shifted_lognormal(), shifted_lognormal()),
prior = c(
prior(normal(-2, 2), class = "Intercept", dpar = "mu1"),
prior(normal(2, 2), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar ="mu1"),
prior(normal(0, 4), class = "b", dpar ="mu2"),
prior(uniform(0, 0.005), class = "ndt1"),
prior(uniform(0, 210), class = "ndt2"),
prior(gamma(0.1, 0.01), class = "sd", dpar = "mu1"),
prior(gamma(0.1, 0.01), class = "sd", dpar = "mu2"),
prior(gamma(0.1, 0.01), class = "sigma1"),
prior(gamma(0.1, 0.01), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(msln3)
## Family: mixture(shifted_lognormal, shifted_lognormal)
## Links: mu1 = identity; sigma1 = identity; ndt1 = identity; mu2 = identity; sigma2 = identity; ndt2 = identity; theta1 = identity; theta2 = identity
## Formula: Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project)
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Group-Level Effects:
## ~Project (Number of levels: 32)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(mu1_Intercept) 0.51 0.08 0.38 0.69 1.00 3117 5884
## sd(mu2_Intercept) 0.45 0.11 0.25 0.68 1.00 3530 4993
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## mu1_Intercept -1.97 0.10 -2.15 -1.78 1.00 2122
## mu2_Intercept 3.33 0.11 3.11 3.55 1.00 7528
## mu1_AlgorithmBugspots -2.71 0.10 -2.93 -2.53 1.00 3998
## mu1_LOC 0.27 0.05 0.17 0.38 1.00 7311
## mu1_FixCount 0.08 0.03 0.03 0.14 1.00 11536
## mu2_AlgorithmBugspots -8.17 0.20 -8.57 -7.78 1.00 5876
## mu2_LOC 0.14 0.08 -0.01 0.30 1.00 7333
## mu2_FixCount -0.08 0.09 -0.25 0.11 1.00 10368
## Tail_ESS
## mu1_Intercept 4355
## mu2_Intercept 10219
## mu1_AlgorithmBugspots 6224
## mu1_LOC 7533
## mu1_FixCount 11930
## mu2_AlgorithmBugspots 9183
## mu2_LOC 9266
## mu2_FixCount 11260
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma1 0.35 0.03 0.30 0.41 1.00 4250 6721
## ndt1 0.00 0.00 0.00 0.00 1.00 4278 6515
## sigma2 0.55 0.06 0.45 0.67 1.00 4849 9019
## ndt2 0.00 0.00 0.00 0.00 1.00 7805 9077
## theta1 0.66 0.03 0.60 0.71 1.00 8923 11386
## theta2 0.34 0.03 0.29 0.40 1.00 8923 11386
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Results:
pp_check(msln3, nsamples = 100) + scale_x_continuous(trans='log10')
mcmc_areas(msln3, pars = c("b_mu1_AlgorithmBugspots", "b_mu2_AlgorithmBugspots"), prob = 0.95)
eff = conditional_effects(msln3, effects = c("Algorithm"))
eff$Algorithm
eff
And the diagnostics, which all look good:
min(neff_ratio(msln3))
## [1] 0.1322605
max(rhat(msln3))
## [1] 1.001518
plot(msln3)
msln1p = brm(
bf(Runtime ~ 1 + Algorithm + LOC),
d,
family = mixture(shifted_lognormal(), shifted_lognormal()),
prior = c(
prior(normal(-2, 2), class = "Intercept", dpar = "mu1"),
prior(normal(2, 2), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar ="mu1"),
prior(normal(0, 4), class = "b", dpar ="mu2"),
prior(uniform(0, 0.005), class = "ndt1"),
prior(uniform(0, 210), class = "ndt2"),
prior(gamma(0.1, 0.01), class = "sigma1"),
prior(gamma(0.1, 0.01), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = "only",
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
pp_check(msln1p, nsamples = 100) + scale_x_continuous(trans='log10')
msln1 = brm(
bf(Runtime ~ 1 + Algorithm + LOC),
d,
family = mixture(shifted_lognormal(), shifted_lognormal()),
prior = c(
prior(normal(-2, 2), class = "Intercept", dpar = "mu1"),
prior(normal(2, 2), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar ="mu1"),
prior(normal(0, 4), class = "b", dpar ="mu2"),
prior(uniform(0, 0.005), class = "ndt1"),
prior(uniform(0, 210), class = "ndt2"),
prior(gamma(0.1, 0.01), class = "sigma1"),
prior(gamma(0.1, 0.01), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(msln1)
## Family: mixture(shifted_lognormal, shifted_lognormal)
## Links: mu1 = identity; sigma1 = identity; ndt1 = identity; mu2 = identity; sigma2 = identity; ndt2 = identity; theta1 = identity; theta2 = identity
## Formula: Runtime ~ 1 + Algorithm + LOC
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## mu1_Intercept -1.98 0.05 -2.08 -1.89 1.00 23228
## mu2_Intercept 3.35 0.07 3.21 3.50 1.00 21646
## mu1_AlgorithmBugspots -2.95 0.12 -3.17 -2.72 1.00 6394
## mu1_LOC 0.33 0.04 0.25 0.41 1.00 11996
## mu2_AlgorithmBugspots -8.30 0.23 -8.78 -7.88 1.00 6329
## mu2_LOC 0.09 0.06 -0.03 0.20 1.00 15658
## Tail_ESS
## mu1_Intercept 11474
## mu2_Intercept 12134
## mu1_AlgorithmBugspots 9892
## mu1_LOC 10586
## mu2_AlgorithmBugspots 8775
## mu2_LOC 11069
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma1 0.58 0.03 0.52 0.64 1.00 11707 11748
## ndt1 0.00 0.00 0.00 0.00 1.00 6882 6927
## sigma2 0.70 0.05 0.61 0.81 1.00 11714 9788
## ndt2 0.00 0.00 0.00 0.00 1.00 6341 7006
## theta1 0.65 0.03 0.59 0.70 1.00 16384 11384
## theta2 0.35 0.03 0.30 0.41 1.00 16384 11384
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
pp_check(msln1, nsamples = 100) + scale_x_continuous(trans='log10')
msln2p = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount),
d,
family = mixture(shifted_lognormal(), shifted_lognormal()),
prior = c(
prior(normal(-2, 2), class = "Intercept", dpar = "mu1"),
prior(normal(2, 2), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar ="mu1"),
prior(normal(0, 4), class = "b", dpar ="mu2"),
prior(uniform(0, 0.005), class = "ndt1"),
prior(uniform(0, 210), class = "ndt2"),
prior(gamma(0.1, 0.01), class = "sigma1"),
prior(gamma(0.1, 0.01), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = "only",
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
pp_check(msln2p, nsamples = 100) + scale_x_continuous(trans='log10')
msln2 = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount),
d,
family = mixture(shifted_lognormal(), shifted_lognormal()),
prior = c(
prior(normal(-2, 2), class = "Intercept", dpar = "mu1"),
prior(normal(2, 2), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar ="mu1"),
prior(normal(0, 4), class = "b", dpar ="mu2"),
prior(uniform(0, 0.005), class = "ndt1"),
prior(uniform(0, 210), class = "ndt2"),
prior(gamma(0.1, 0.01), class = "sigma1"),
prior(gamma(0.1, 0.01), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(msln2)
## Family: mixture(shifted_lognormal, shifted_lognormal)
## Links: mu1 = identity; sigma1 = identity; ndt1 = identity; mu2 = identity; sigma2 = identity; ndt2 = identity; theta1 = identity; theta2 = identity
## Formula: Runtime ~ 1 + Algorithm + LOC + FixCount
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## mu1_Intercept -2.00 0.05 -2.09 -1.90 1.00 25897
## mu2_Intercept 3.32 0.08 3.15 3.48 1.00 17377
## mu1_AlgorithmBugspots -2.96 0.11 -3.18 -2.74 1.00 6844
## mu1_LOC 0.33 0.04 0.25 0.41 1.00 11323
## mu1_FixCount 0.05 0.04 -0.03 0.12 1.00 12690
## mu2_AlgorithmBugspots -8.19 0.23 -8.70 -7.78 1.00 6940
## mu2_LOC 0.09 0.06 -0.02 0.20 1.00 14572
## mu2_FixCount -0.08 0.09 -0.25 0.09 1.00 10438
## Tail_ESS
## mu1_Intercept 11370
## mu2_Intercept 13421
## mu1_AlgorithmBugspots 8679
## mu1_LOC 10022
## mu1_FixCount 11765
## mu2_AlgorithmBugspots 7873
## mu2_LOC 10920
## mu2_FixCount 11046
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma1 0.58 0.03 0.52 0.64 1.00 9496 10100
## ndt1 0.00 0.00 0.00 0.00 1.00 6291 5797
## sigma2 0.69 0.05 0.60 0.80 1.00 10444 11537
## ndt2 0.00 0.00 0.00 0.00 1.00 7482 6796
## theta1 0.65 0.03 0.59 0.71 1.00 15415 11996
## theta2 0.35 0.03 0.29 0.41 1.00 15415 11996
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
pp_check(msln2, nsamples = 100) + scale_x_continuous(trans='log10')
msln4p = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project) + (1 | language)),
d,
family = mixture(shifted_lognormal(), shifted_lognormal()),
prior = c(
prior(normal(-2, 2), class = "Intercept", dpar = "mu1"),
prior(normal(2, 2), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar ="mu1"),
prior(normal(0, 4), class = "b", dpar ="mu2"),
prior(uniform(0, 0.005), class = "ndt1"),
prior(uniform(0, 210), class = "ndt2"),
prior(gamma(0.1, 0.01), class = "sd", dpar = "mu1"),
prior(gamma(0.1, 0.01), class = "sd", dpar = "mu2"),
prior(gamma(0.1, 0.01), class = "sigma1"),
prior(gamma(0.1, 0.01), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = "only",
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
pp_check(msln4p, nsamples = 100) + scale_x_continuous(trans='log10')
msln4 = brm(
bf(Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project) + (1 | language)),
d,
family = mixture(shifted_lognormal(), shifted_lognormal()),
prior = c(
prior(normal(-2, 2), class = "Intercept", dpar = "mu1"),
prior(normal(2, 2), class = "Intercept", dpar = "mu2"),
prior(normal(0, 4), class = "b", dpar ="mu1"),
prior(normal(0, 4), class = "b", dpar ="mu2"),
prior(uniform(0, 0.005), class = "ndt1"),
prior(uniform(0, 210), class = "ndt2"),
prior(gamma(0.1, 0.01), class = "sd", dpar = "mu1"),
prior(gamma(0.1, 0.01), class = "sd", dpar = "mu2"),
prior(gamma(0.1, 0.01), class = "sigma1"),
prior(gamma(0.1, 0.01), class = "sigma2")
),
init = 0,
iter = SAMPLES,
warmup = WARMUP,
chains = CHAINS,
cores = parallel::detectCores(),
sample_prior = FALSE,
control = list(adapt_delta = DELTA, max_treedepth = TREE),
seed = SEED
)
summary(msln4)
## Family: mixture(shifted_lognormal, shifted_lognormal)
## Links: mu1 = identity; sigma1 = identity; ndt1 = identity; mu2 = identity; sigma2 = identity; ndt2 = identity; theta1 = identity; theta2 = identity
## Formula: Runtime ~ 1 + Algorithm + LOC + FixCount + (1 | Project) + (1 | language)
## Data: d (Number of observations: 480)
## Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
## total post-warmup samples = 16000
##
## Group-Level Effects:
## ~language (Number of levels: 8)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(mu1_Intercept) 0.05 0.10 0.00 0.32 1.01 484 475
## sd(mu2_Intercept) 0.22 0.23 0.00 0.72 1.01 588 1722
##
## ~Project (Number of levels: 32)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(mu1_Intercept) 0.51 0.08 0.37 0.69 1.00 2677 3972
## sd(mu2_Intercept) 0.34 0.18 0.00 0.63 1.00 448 613
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## mu1_Intercept -1.97 0.10 -2.17 -1.76 1.00 2115
## mu2_Intercept 3.32 0.15 3.02 3.62 1.00 9783
## mu1_AlgorithmBugspots -2.70 0.10 -2.92 -2.53 1.00 2746
## mu1_LOC 0.27 0.05 0.17 0.37 1.00 6353
## mu1_FixCount 0.08 0.03 0.02 0.14 1.00 9980
## mu2_AlgorithmBugspots -8.20 0.20 -8.58 -7.80 1.00 2710
## mu2_LOC 0.16 0.08 0.00 0.31 1.00 6020
## mu2_FixCount -0.08 0.09 -0.25 0.10 1.00 3843
## Tail_ESS
## mu1_Intercept 4108
## mu2_Intercept 7642
## mu1_AlgorithmBugspots 6659
## mu1_LOC 9218
## mu1_FixCount 10798
## mu2_AlgorithmBugspots 7790
## mu2_LOC 10580
## mu2_FixCount 9911
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma1 0.35 0.03 0.30 0.41 1.00 2118 5687
## ndt1 0.00 0.00 0.00 0.00 1.00 3815 6760
## sigma2 0.56 0.06 0.45 0.69 1.00 856 2254
## ndt2 0.00 0.00 0.00 0.00 1.00 3140 7243
## theta1 0.66 0.03 0.59 0.72 1.00 2555 9962
## theta2 0.34 0.03 0.28 0.41 1.00 2555 9962
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
pp_check(msln4, nsamples = 100) + scale_x_continuous(trans='log10')
loo_msln1 = loo(msln1)
loo_msln2 = loo(msln2, reloo=TRUE)
loo_msln3 = loo(msln3, reloo=TRUE)
loo_msln4 = loo(msln4, reloo=TRUE)
loo_compare(loo_msln1, loo_msln2, loo_msln3, loo_msln4)
## elpd_diff se_diff
## msln3 0.0 0.0
## msln4 -1.2 1.6
## msln1 -93.2 17.0
## msln2 -94.4 17.1
loo_compare(loo_ln2, loo_sln2, loo_mln3, loo_msln3)
## elpd_diff se_diff
## msln3 0.0 0.0
## mln3 -277.0 8.7
## sln2 -418.7 19.6
## ln2 -539.8 19.0
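A common rule of thumb for reading these tables is to compare `elpd_diff` to a small multiple of `se_diff`: a model is only clearly worse than the best one when the difference exceeds a few standard errors. The hypothetical helper below (not part of the original script) encodes that rule and applies it to two rows from the tables above.

```r
# Hypothetical helper: flag whether a model's elpd difference to the best
# model is large relative to its standard error (k standard errors).
elpd_clearly_worse <- function(elpd_diff, se_diff, k = 2) {
  # elpd_diff is negative for worse models; TRUE when it exceeds k * se_diff
  elpd_diff < -k * se_diff
}

elpd_clearly_worse(-277.0, 8.7)  # mln3 vs msln3: TRUE, clearly worse
elpd_clearly_worse(-1.2, 1.6)    # msln4 vs msln3: FALSE, no clear difference
```

By this rule, msln3 is clearly preferable to the plain lognormal and non-mixture alternatives, while msln3 and msln4 are practically indistinguishable.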
Bürkner, Paul-Christian. 2020. Brms: Bayesian Regression Models Using ’Stan’. https://CRAN.R-project.org/package=brms.
Gabry, Jonah, and Tristan Mahr. 2020. Bayesplot: Plotting for Bayesian Models. https://CRAN.R-project.org/package=bayesplot.